Method of storing and retrieving miniaturised data

ABSTRACT

A method of storing data including the steps of providing a first index of first location identifiers, a second index of second location identifiers and a dictionary data base of data items, wherein the first location identifiers are adapted to identify the location of second location identifiers in the second index and the second location identifiers are adapted to identify the location of data items in the dictionary data base, receiving data and separating the data into a plurality of data items and storing the data items in a main data base, whereby at least one of the data items is stored in the main data base as at least one first location identifier, which identifies at least one second location identifier, which identifies the or each data item in the dictionary data base.

FIELD OF THE INVENTION

[0001] The present invention relates primarily, although not exclusively, to techniques used for storing electronic data in a form which requires less storage space.

BACKGROUND OF THE INVENTION

[0002] A typical method for compressing data involves the use of a dictionary data base which lists commonly occurring data and replaces this commonly occurring data with a coded “token” which effectively represents that data using a reduced number of data bits.

[0003] Whenever an item of data occurs repeatedly this data item is replaced by its equivalent “token” and accordingly that data item is stored in a compressed form.

[0004] When data is stored in the compressed form, by using a look-up table each token can be replaced by its equivalent data item so that the original data can be reformed.

[0005] The above conventional compression technique has a number of drawbacks. First, the number of data bits required to represent a token can itself be significant, with the result that significant storage space is required to store each token. In addition, searching a data base which includes tokens can be quite cumbersome because tokens need to be reconverted to their original data items before a search of each of the data items can be properly conducted.

[0006] Object of the Invention

[0007] The present invention provides an alternative to existing methods of storing data in a miniaturised form and extends to methods for encrypting data as well as systems for implementing the method, computer programs and storage medium for storing electronic data which is able to implement the method and system.

[0008] Disclosure of the Invention

[0009] According to the present invention there is provided a method of storing data including the steps of providing a first index of first location identifiers, a second index of second location identifiers and a dictionary data base of data items, wherein the first location identifiers are adapted to identify the location of second location identifiers in the second index and the second location identifiers are adapted to identify the location of data items in the dictionary data base, receiving data and separating the data into a plurality of data items and storing the data items in a main data base, whereby at least one of the data items is stored in the main data base as at least one first location identifier, which identifies at least one second location identifier, which identifies the or each data item in the dictionary data base.

[0010] According to another aspect of the present invention there is provided a method of retrieving data stored in a miniaturised form in a main data base, including the steps of accessing the main data base, retrieving one or more items of data including at least one first location identifier from the main data base, using the first location identifier to access and retrieve the location of a second location identifier identified in the first index by the first location identifier, accessing and retrieving from the second location identifier in the second index the location of an item of data in a dictionary data base.

[0011] It is preferred that the method of storing data includes the step of searching the dictionary data base for at least one data item and replacing the data item with one first location identifier which indicates the location of one location identifier in the second index, which second location identifier indicates the location of the data item in the dictionary data base.

[0012] It is preferred that the method includes the step of searching the dictionary data base for each data item and identifying if the data item occurs in the dictionary data base and if the data item occurs in the dictionary data base, retrieving the second location identifier in the second index that identifies the location of the data item in the dictionary data base, retrieving the first location identifier in the first index which identifies the location of the second location identifier in the second index and storing the first location identifier in a main data base in place of the data item.

[0013] It is preferred that the data item includes a string of data, a field of data or other group of data that can represent information in a predetermined format.

[0014] The or each data item preferably represents a stream of data which represents information which can be searched.

[0015] Each data item preferably represents a name, initial, address, phone number or other words or numbers or initials or characters or character strings or number strings.

[0016] Each first location identifier preferably includes a pointer to the second index.

[0017] Each second location identifier may include a pointer to the dictionary data base.

[0018] The first index may comprise a plurality of pointers.

[0019] Preferably the second index comprises a plurality of pointers.

[0020] The first index may comprise a sequential list of pointers.

[0021] The second index may comprise a sequential list of pointers.

[0022] The dictionary data base preferably includes a plurality of data bases each with unique addresses which are represented by the location identifiers.

[0023] Each index may include a plurality of sub-indexes.

[0024] Preferably each second index is divided into different sections representing locations of predetermined types of data items.

[0025] Each first index is preferably divided into different sections representing the location of second location identifiers associated with predetermined types of data items.

[0026] The method may include providing additional indexes with additional location identifiers.

[0027] According to another aspect of the present invention there is provided a system for storing data, the system including at least one dictionary data base and at least two index data bases wherein the dictionary data base comprises a plurality of data items, a first one of the index data bases comprising a plurality of data item location identifiers, which respectively identify the location of at least one data item in the dictionary data base and a second one of the index data bases including a plurality of first location identifiers which respectively identify the location of at least one data item location identifier in the first index data base, and wherein the system includes a processing means which is adapted to receive data including data items and to store the data in a compressed form by storing in place of each data item occurring in the dictionary data base, each corresponding first location identifier, whereby each data item occurring in the dictionary data base can be retrieved by referencing the data item location identifier identified by the first location identifier.

[0028] Preferably the at least two index data bases include separate lists of location identifiers in a common data base or which are part of other data bases.

[0029] Preferably there is provided a storage medium including a sequence of instructions adapted to control a data processor to set up the system.

[0030] The first index data base may be part of the dictionary data base.

[0031] The system may include one or more additional index data bases each with location identifiers which identify the location of another location identifier of another index data base.

[0032] It is preferred that the system includes a main data base which is adapted to store a stream of data as a combination of data items which are not represented in the dictionary data base and first location identifiers.

[0033] According to one embodiment the stream of data stored in the main data base may have data items and first location identifiers which are stored in an order determined by a further index data base and a reprocessing means which is adapted to control the ordering of data in the main data base with reference to the further index data base.

[0034] Preferably the dictionary data base has data items stored in a predetermined order which is determined by how commonly or frequently each data item stored therein is expected to occur in a data stream of data items.

[0035] It is preferred that the most common data items have a location in the dictionary data base that is identified by a dictionary data base location identifier having minimal bits compared to an uncommon data item.

[0036] Preferably the dictionary data base index comprises dictionary data base location identifiers arranged sequentially from lowest number to highest number of bits required to define them.

[0037] Each first location identifier may comprise a pointer having a number which identifies a position of one data item location identifier in the dictionary data base index.

[0038] Preferably each data item location identifier comprises a pointer having a number which identifies the position of one data item in the dictionary data base.

[0039] The dictionary data base may be divided into different sections which have data items with locations which are identified by data item location identifiers from different dictionary data base indexes.

[0040] The dictionary data base preferably includes storage space into which data items can be added.

[0041] According to another aspect of the present invention there is provided a computer program which is adapted to control a computer to provide at least one dictionary data base and at least two index data bases, wherein the dictionary data base comprises a plurality of data items, a first one of the index data bases comprises a plurality of data item location identifiers, which respectively identify the location of at least one data item in the dictionary data base, and a second one of the index data bases includes a plurality of first location identifiers which respectively identify the location of at least one data item location identifier in the first index data base, and wherein the computer program includes instructions to control the computer to receive data including data items and to store the data in a compressed form by storing in place of each data item which occurs in the dictionary data base, each corresponding first location identifier, whereby each data item which occurs in the dictionary data base can be retrieved by referencing the data item location identifier identified by the first location identifier.

[0042] Preferably the at least two index data bases include separate lists of location identifiers in a common data base or which are part of other data bases.

[0043] The first index data base may be part of the dictionary data base.

[0044] It is preferred that the computer program includes an exception means for storing data items which do not occur in the dictionary data base.

[0045] The exception means preferably includes a predetermined part of the dictionary data base.

[0046] The exception means may be adapted to provide a dictionary index location identifier for any new data item stored in the dictionary data base.

[0047] According to another embodiment of the present invention the exception means includes an exceptions data base which is adapted to store data items which do not occur in the dictionary data base.

[0048] The computer program may include a means for storing different types of data items in different dictionary data bases.

[0049] Each dictionary data base may have predetermined dictionary location identifiers in the first one of the index data bases, which provide the location of data items in that data base.

[0050] Each dictionary data base may be split into a plurality of data types or fields each having a plurality of data items of that type or field.

[0051] According to another aspect of the present invention there is provided a storage medium having computer software stored thereon which is adapted to control a computer to set up a system according to any one of the previously described embodiments of the invention.

[0052] According to another aspect of the present invention there is provided a system for retrieving data items stored in a miniaturised form, the system including at least one dictionary data base and at least two index data bases wherein the dictionary data base comprises a plurality of data items and a first one of the index data bases comprises a plurality of data item location identifiers which respectively identify the location of at least one data item in the dictionary data base and a second one of the index data bases includes a plurality of first location identifiers which respectively identify the location of at least one data item location identifier in the first index data base and a processing means, wherein the processing means is adapted to receive a first data stream including a plurality of first location identifiers and produce a second data stream including the data items without first location identifiers and wherein first location identifiers are replaced by corresponding data items.

[0053] It is preferred that the first data stream is adapted to be received by reading a data base.

[0054] The data base may be in a storage medium which is readable by a computer hardware device.

[0055] The data base may be stored in a computer memory.

[0056] The data stream preferably is transmitted and received from a communication system.

[0057] The first data stream may be received and stored in a data processor before being read and compressed or decompressed.

[0058] According to another aspect of the present invention there is provided a system which includes the system for compressing data and the system for retrieving data.

[0059] According to another embodiment of the present invention there is provided a storage medium having a computer program stored thereon which is adapted to control a computer to set up/implement the combined system.

[0060] According to another aspect of the present invention there is provided a method of encrypting data using the system for compressing data.

[0061] According to one embodiment of the method of encrypting data, the location of data items may be changed by using a coding means for changing the data item location identifiers in a reconvertible manner.

[0062] It is preferred that the data item location identifiers are able to be reordered so that the first location identifiers identify different data item location identifiers to those before reordering.

[0063] According to another embodiment of the present invention any one of the systems includes a scrambling means for reordering data item location identifiers in the first index data base and for storing the method of reordering whereby the reordering can be reversed.

[0064] According to another aspect of the present invention there is provided a method of decryption using the system for retrieving data.

[0065] It is preferred that the method of decryption includes the scrambling means for reversing any reordering which has taken place of the dictionary item location identifiers.

[0066] According to another embodiment of the present invention the method of decryption includes a descrambling means which includes means for reversing any reordering of dictionary item location identifiers in the first index data base.

[0067] According to another embodiment of the present invention there is provided a method of encrypting and decrypting data which incorporates the combined system for compressing and retrieving data, the method also including the step of at predetermined times using a scrambling means to reorder dictionary item location identifiers in accordance with a predetermined ordering technique which is stored or able to be stored and received by a descrambling means at a receiving end of the system.

[0068] A preferred embodiment of the present invention will now be described by way of example only with reference to HTML script or text.

BEST MODE OF CARRYING OUT THE INVENTION

[0069] As an example the following HTML text will be minimised in accordance with the preferred embodiment of the invention:

[0070] The Frontpage install <!--webbot bot=“PurpleText” preview=“This page is created in the root directory of your FrontPage when FrontPage is installed. It contains information that allows users to edit pages in your web using the Microsoft Web Publishing Wizard or programs which use the Microsoft Web Publishing Wizard such as FrontPad using the same username and password they would use if they were authorising with Microsoft FrontPage. If you do not want to allow users to edit files on this web using tools other than Microsoft FrontPage, you can delete this file”.

[0071] The above text can be split into a number of groups which for convenience will be referred to as data items. Thus the word “the” constitutes one data item, the word “frontpage” constitutes another data item and so on for the word “install”, “<!--webbot bot=” and “PurpleText”. In a typical situation each of the above data items would be stored in a data base and the space required to store this data is accordingly no longer available to store other data.

[0072] Using the miniaturisation technique in accordance with the present invention two indexing lists are set up as shown in FIG. 1.

[0073] A first list 11 is set up which is effectively a data base of pointers.

[0074] For convenience only some of the pointers are shown, being those pointers required to identify text which is stored relating to the sample of HTML text referred to above.

[0075] The first list 11 is generated by analysing the repetitive structure of HTML text and script that exists as documents or data transfer streams. This list has common HTML text type documents. All items that are of repetitive nature that can be identified exist in this list. This text list could be a super set of other common lists, for example, the English language list or the French language list.

[0076] A second list 12 contains a dictionary of the HTML text which is to be miniaturised. Each data item is located at a specific position in the list 12 and this position is identified by a number which is pointed to by a pointer from list 11.

[0077] The list 12 is effectively a dictionary data base which is generated by coding the entries in the HTML text and script list.

[0078] The 128 most common items are located first in the list and are assigned first level representation (typically 8 bits) in alphabetic sequence. The rest of the list is organised alphabetically and is assigned the minimum number of bits to uniquely identify the location of the original data in the list 11.

[0079] As an example, if the total number of data items (e.g. characters) in the first list is 29,456 then 15 bits (0..32767) would be needed to represent the unique location of the start of a particular data item. The number of unique entries is then calculated. If, for example, there are 3,128 unique entries in the list 11, then 12 bits (0 to 4095) will be required to identify the unique data items in the list.
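
By way of illustration only, the bit counts given in paragraph [0079] may be checked with the following short Python sketch. The function name bits_needed is an assumption introduced for this sketch and does not appear in the specification; it simply returns the smallest number of bits able to address a given number of distinct positions.

```python
def bits_needed(count: int) -> int:
    """Smallest number of bits able to address `count` positions (0 .. count - 1)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # The highest position is count - 1; its binary length is the pointer width.
    return max(1, (count - 1).bit_length())


# Figures taken from paragraph [0079].
total_positions = 29_456   # data items (e.g. characters) in the list
unique_entries = 3_128     # unique entries in the list

print(bits_needed(total_positions))  # 15, since 2**14 < 29,456 <= 2**15
print(bits_needed(unique_entries))   # 12, since 2**11 < 3,128 <= 2**12
```

On these figures a pointer to a unique entry is three bits shorter than a pointer to a raw position, which is the saving the double index relies upon.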

[0080] It follows from the above that by setting up the first list 11 a reduced number of pointers are required to represent the data items in the second list, because data items that are repeated do not need to have an associated pointer.

[0081] Accordingly if a data item occurs 1,000 times in the second list or dictionary data base, a single pointer is all that is required in the first list 11 and accordingly the single pointer is all that needs to be stored in a general data base 13.

[0082] Thus referring back to the example of HTML text given above, the word “the” is the first data item which is to be stored in the general data base 13. Because the word “the” is a common word, it therefore occurs in the most common section of the second list 12 and may be located at position 3406. The corresponding pointer from the first list 11 may be located at position 8A. Accordingly the word “the” does not need to be stored in the general data base 13 nor does the second list pointer 3406. Instead the first list pointer 8A can be stored in the general data base 13 and this obviously has a lower number of bits required to describe it and accordingly requires less space for storage.

[0083] The next word in the HTML text is “Frontpage” which is not as common as the word “the”, but does exist many times in normal HTML text. It therefore is located in the less common section of the dictionary list 12 at a location 23456. In the first list 11 location 23456 is represented by pointer 2408.

[0084] It follows therefore that pointer 2408 is placed in the general data base 13 straight after pointer 8A. The word “install” is the next data item in the HTML text and is an uncommon word which is located at position 26578. The corresponding pointer in first list 11 is located at position 2458. Accordingly this pointer 2458 is stored in the general data base 13 after pointer 2408.

[0085] Finally the script string “<!--webbot bot=” is a very common HTML script command and is therefore located at position 4987 in the second list. This location 4987 is represented by pointer 8F in the first list 11 and accordingly pointer 8F is stored in the general data base 13 instead of the script “<!--webbot bot=”.

[0086] The word “PurpleText” is not common in either HTML script or text and therefore does not occur in the dictionary list 12. As a result this word is represented by an exception flag “00” in the general data base 13 and has no associated pointer. Similarly any other script or text which is not represented in the dictionary data base 12, is also classified as an exception and is copied verbatim into the general data base 13.
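
The storing steps of paragraphs [0082] to [0086] may be sketched, by way of example only, in Python as follows. The pointer and position values are those given in the description. Everything else is an assumption made for illustration: the names DICTIONARY, FIRST_LIST and miniaturise, the use of in-memory mappings in place of the lists of FIG. 1, and the representation of each entry of the general data base 13 as a pair. The data items are assumed to have been separated already and normalised to the spelling used in the dictionary.

```python
# Second list 12 (dictionary): position -> data item (positions from the description).
DICTIONARY = {
    3406: "the",
    23456: "frontpage",
    26578: "install",
    4987: "<!--webbot bot=",
}
# First list 11: pointer -> position in the second list 12.
FIRST_LIST = {"8A": 3406, "2408": 23456, "2458": 26578, "8F": 4987}
EXCEPTION_FLAG = "00"

# Reverse maps built once so that storing does not scan the lists repeatedly.
POSITION_OF_ITEM = {item: position for position, item in DICTIONARY.items()}
POINTER_OF_POSITION = {position: pointer for pointer, position in FIRST_LIST.items()}


def miniaturise(items):
    """Return the general data base 13 as a list of (pointer, verbatim) pairs."""
    general_db = []
    for item in items:
        position = POSITION_OF_ITEM.get(item)
        if position is None:
            # Not in the dictionary: exception flag plus the item copied verbatim.
            general_db.append((EXCEPTION_FLAG, item))
        else:
            # In the dictionary: only the first list pointer is stored.
            general_db.append((POINTER_OF_POSITION[position], None))
    return general_db


items = ["the", "frontpage", "install", "<!--webbot bot=", "PurpleText"]
print(miniaturise(items))
```

In this sketch neither the word “the” nor its position 3406 reaches the general data base; only pointer 8A does, while “PurpleText” is carried through unchanged behind the exception flag.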

[0087] Reconstruction of the original data represented in the general data base is simply achieved by using a reverse look-up algorithm.

[0088] Thus if the pointer 8A is read, a look-up algorithm is used to access the first list 11 which gives the location of the corresponding data item at location 3406 in the second list 12.

[0089] At location 3406 the word “the” is located and this word is then retrieved and substituted for the pointer 8A.
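
The reverse look-up of paragraphs [0087] to [0089] may be sketched in Python as follows, again by way of example only. The pointer and position values are those of the description. The names DICTIONARY, FIRST_LIST and reconstruct, and the representation of each stored entry as a (pointer, verbatim) pair with the exception flag “00” marking verbatim text, are assumptions made for this sketch.

```python
# Second list 12 (dictionary): position -> data item.
DICTIONARY = {
    3406: "the",
    23456: "frontpage",
    26578: "install",
    4987: "<!--webbot bot=",
}
# First list 11: pointer -> position in the second list 12.
FIRST_LIST = {"8A": 3406, "2408": 23456, "2458": 26578, "8F": 4987}
EXCEPTION_FLAG = "00"


def reconstruct(general_db):
    """Replace each stored pointer by its data item; pass exceptions through."""
    items = []
    for pointer, verbatim in general_db:
        if pointer == EXCEPTION_FLAG:
            items.append(verbatim)              # stored verbatim, no look-up needed
        else:
            position = FIRST_LIST[pointer]      # first look-up: pointer -> position
            items.append(DICTIONARY[position])  # second look-up: position -> item
    return items


stored = [("8A", None), ("2408", None), ("2458", None), ("8F", None), ("00", "PurpleText")]
print(reconstruct(stored))
# ['the', 'frontpage', 'install', '<!--webbot bot=', 'PurpleText']
```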

[0090] The above example discloses what is in effect a double index technique, utilising two pointers. However the present invention may equally be applicable to any number of indexes and pointers, depending on the data which is to be miniaturised. Thus one application would be in miniaturising data located in telephone white pages. In such a situation a number of dictionary lists would be required, such as a names list, a streets list and a locations list.

[0091] Each of these lists would have their own separate first and second list pointers using the examples outlined above. Furthermore, each list could have an associated list which would also require a double index pointer system.

[0092] Thus a streets list having the names of various streets may also require a sub-list of street types such as “ST”, “PL”, “CR” etc.
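
One possible arrangement of the several lists of paragraphs [0090] to [0092] is sketched below in Python, by way of example only. Only the street types “ST”, “PL” and “CR” come from the description. The class DoubleIndex, the field names, the sample name and street, and all pointer and position values are hypothetical and are introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DoubleIndex:
    """One dictionary list together with its own first list of pointers."""
    first_list: dict   # pointer -> position
    dictionary: dict   # position -> data item

    def pointer_for(self, item):
        """Return the first list pointer for `item`, or None if it is absent."""
        for pointer, position in self.first_list.items():
            if self.dictionary[position] == item:
                return pointer
        return None

    def item_for(self, pointer):
        """Follow pointer -> position -> data item."""
        return self.dictionary[self.first_list[pointer]]


# Hypothetical contents: one double index per type of data item.
INDEXES = {
    "name": DoubleIndex({"01": 10}, {10: "SMITH"}),
    "street": DoubleIndex({"01": 20}, {20: "HIGH"}),
    "street_type": DoubleIndex({"01": 0, "02": 1, "03": 2}, {0: "ST", 1: "PL", 2: "CR"}),
}


def miniaturise_record(record):
    """Replace each field value by the pointer from that field's own index."""
    return {field: INDEXES[field].pointer_for(value) for field, value in record.items()}


print(miniaturise_record({"name": "SMITH", "street": "HIGH", "street_type": "PL"}))
# {'name': '01', 'street': '01', 'street_type': '02'}
```

Because each type of data item has its own index, the same short pointer value can be reused in different fields without ambiguity.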

[0093] According to another example image data may be represented by multi-level indexing techniques. Thus the first level may be the fact that the area is black, the second level may indicate the shape, the third level may indicate the size. Similarly the levels may relate to further deconstruction of the original data.

[0094] Clearly the above compression technique is not limited to text based data, but is also able to be used in connection with foreign languages, foreign character sets (e.g. Arabic and Chinese), music and speech phonemes. The only requirement is that the data has a repetitive nature that can be analysed and represented as uniquely coded and identifiable items.

[0095] An important advantage of the miniaturisation technique which is described above lies with the ability to search data items in their miniaturised format. Thus instead of searching for the word “the” in the preferred embodiment given above, a search could be conducted for the pointer 8A. This is in contrast to conventional searching techniques of compressed text, where it is necessary to continually convert and reconvert text in order to complete the search.
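
Searching in the miniaturised format, as described in paragraph [0095], may be sketched in Python as follows. This is an illustrative sketch only: the search term is converted to its pointer once, and the general data base is then scanned for that pointer without reconstructing any stored item. The names used, and the representation of stored entries as (pointer, verbatim) pairs with the exception flag “00”, are assumptions made for this sketch.

```python
# Second list 12 (dictionary) and first list 11, using values from the description.
DICTIONARY = {3406: "the", 23456: "frontpage", 26578: "install", 4987: "<!--webbot bot="}
FIRST_LIST = {"8A": 3406, "2408": 23456, "2458": 26578, "8F": 4987}
EXCEPTION_FLAG = "00"

POSITION_OF_ITEM = {item: position for position, item in DICTIONARY.items()}
POINTER_OF_POSITION = {position: pointer for pointer, position in FIRST_LIST.items()}


def find_in_miniaturised(general_db, term):
    """Return the indexes of entries in the general data base that match `term`."""
    position = POSITION_OF_ITEM.get(term)
    if position is None:
        # The term has no pointer, so it can only occur as a verbatim exception.
        return [i for i, (pointer, verbatim) in enumerate(general_db)
                if pointer == EXCEPTION_FLAG and verbatim == term]
    wanted = POINTER_OF_POSITION[position]
    # Compare pointers directly; no stored entry is converted back to text.
    return [i for i, (pointer, _) in enumerate(general_db) if pointer == wanted]


stored = [("8A", None), ("2408", None), ("00", "PurpleText"), ("8A", None)]
print(find_in_miniaturised(stored, "the"))         # [0, 3]
print(find_in_miniaturised(stored, "PurpleText"))  # [2]
```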

[0096] Although the main focus of the present invention is miniaturisation of data, the invention is equally applicable to encrypting/decrypting data. This is because the indexing system described above in effect replaces common data items with associated pointers which act as tokens.

[0097] Because each token and data item is easily retrievable, the list of tokens/pointers can easily be manipulated in a reversible manner to make unauthorised decryption more difficult.
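
A reversible manipulation of the pointer list of the kind referred to in paragraph [0097] may be sketched in Python as follows. The specification does not prescribe any particular reordering technique; the seeded shuffle used here, and the names scramble and descramble, are assumptions chosen only to show that a stored ordering allows the reordering to be reversed. Python's random module is not designed for security purposes, so the sketch illustrates reversibility rather than a secure cipher.

```python
import random

# First list 11: pointer -> position in the second list 12 (values from the description).
FIRST_LIST = {"8A": 3406, "2408": 23456, "2458": 26578, "8F": 4987}


def scramble(first_list, seed):
    """Reassign the stored positions among the pointers; return the new list and the ordering used."""
    pointers = sorted(first_list)
    positions = [first_list[p] for p in pointers]
    order = list(range(len(pointers)))
    random.Random(seed).shuffle(order)  # hypothetical reordering technique
    scrambled = {pointers[i]: positions[order[i]] for i in range(len(pointers))}
    return scrambled, order


def descramble(scrambled, order):
    """Reverse `scramble` using the stored ordering."""
    pointers = sorted(scrambled)
    positions = [None] * len(pointers)
    for i, pointer in enumerate(pointers):
        positions[order[i]] = scrambled[pointer]
    return {pointers[i]: positions[i] for i in range(len(pointers))}


scrambled, order = scramble(FIRST_LIST, seed=2024)
assert descramble(scrambled, order) == FIRST_LIST
```

After scrambling, a stored pointer generally resolves to a different data item unless the recipient holds the ordering and reverses it first.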

[0098] The present invention is therefore applicable to any data which includes repetitive elements. This is because these repetitive elements can be represented in an index of pointers/tokens which obviate the need for pointers for each repeated element. It follows therefore that theoretically any data stored, for example in computer memory can be stored in a miniaturised form by eliminating the majority of repeated data items.

1. A method of storing data including the steps of providing a firstindex of first location identifiers, a second index of second locationidentifiers and a dictionary data base of data items, wherein the firstlocation identifiers are adapted to identify the location of secondlocation identifiers in the second index and the second locationidentifiers are adapted to identify the location of data items in thedictionary data base, receiving data and separating the data into aplurality of data items and storing the data items in a main data base,whereby at least one of the data items is stored in the main data baseas at least one first location identifier, which identifies at least onesecond location identifier, which identifies the or each data item inthe dictionary data base.
 2. The method as claimed in claim 1 includingthe step of searching the dictionary data base for at least one dataitem and replacing the data item with one first location identifierwhich indicates the location of one location identifier in the secondindex, which second location identifier indicates the location of thedata item in the dictionary data base.
 3. The method as claimed in claim2 including the step of searching the dictionary data base for each dataitem and identifying if the data item occurs in the dictionary data baseand if the data item occurs in the dictionary data base, retrieving thesecond location identifier in the second index that identifies thelocation of the data item in the dictionary data base, retrieving thefirst location identifier in the first index which identifies thelocation of the second location identifier in the second index andstoring the first location identifier in a main data base in place ofthe data item.
 4. The method as claimed in claim 3 wherein the data itemincludes any one of a string of data, a field of data or other group ofdata that can represent information in a predetermined format.
 5. Themethod as claimed in claim 3 wherein the or each data item represents astream of data which represents information which can be searched. 6.The method as claimed in claim 3 wherein each first location identifierincludes a pointer to the second index.
 7. The method as claimed inclaim 6 wherein each second location identifier includes a pointer tothe dictionary data base.
 8. The method as claimed in claim 7 whereinthe first and second indexes comprise a plurality of pointers.
 9. Themethod as claimed in claim 8 wherein the dictionary data base includes aplurality of data bases each with unique addresses which are representedby the location identifiers.
 10. The method as claimed in claim 9wherein each index includes a plurality of sub-indexes.
 11. The methodas claimed in claim 9 wherein each second index is divided intodifferent sections representing locations of predetermined types of dataitems.
 12. The method as claimed in claim 11 wherein each first index isdivided into different sections representing the location of secondlocation identifiers associated with predetermined types of data items.13. A system for storing data, the system including at least onedictionary data base and at least two index data bases wherein thedictionary data base comprises a plurality of data items, a first one ofthe index data bases comprising a plurality of data item locationidentifiers, which respectively identify the location of at least onedata item in the dictionary data base and a second one of the index databases including a plurality of first location identifiers whichrespectively identify the location of at least one data item locationidentifier in the first index data bases and wherein the system includesa processing means which is adapted to receive data including data itemsand to store the data in a compressed form by storing in place of eachdata item occurring in the dictionary data base, each correspondingfirst location identifier, whereby each data item occurring in thedictionary data base can be retrieved by referencing the data itemlocation identifier identified by the first location identifier.
 14. Thesystem as claimed in claim 13 wherein the at least two index data basesinclude separate lists of location identifers in one or more other databases.
 15. The system as claimed in claim 14 including a storage mediumhaving a sequence of instructions adapted to control a data processor toset up the system.
 16. The system as claimed in claim 15 wherein thefirst index data base is part of the dictionary data base.
 17. Thesystem as claimed in claim 14 including one or more additional indexdata bases each with location identifiers which identify the location ofanother location identifier of another index data base.
 18. The systemas claimed in claim 17 including a main data base which is adapted tostore a stream of data as a combination of data items which are notrepresented in the dictionary data base and first location identifiers.19. The system as claimed in claim 18 wherein the stream of data storedin the main data base may have data items and first location identifierswhich are stored in an order determined by a further index data base anda reprocessing means which is adapted to control the ordering of data inthe main data base with reference to the further index data base. 20.The system as claimed in claim 19 wherein the dictionary data base hasdata items stored in a predetermined order which is determined by howfrequently each data items stored therein is expected to occur in a datastream of data items.
 21. The system as claimed in claim 20 wherein themost common data items have a location in the dictionary data base thatis identified by a dictionary data base location identifier havingminimal bytes compared to an uncommon data item.
 22. The system asclaimed in claim 21 wherein the dictionary data base index comprisesdictionary data base location identifiers arranged sequentially fromlowest number to highest number of bytes required to define them. 23.The system as claimed in claim 22 wherein each first location identifiercomprises a pointer having a number which identifies a position of onedata item location identifier in the dictionary data base index.
 24. Thesystem as claimed in claim 23 wherein each data item location identifiercomprises a pointer having a number which identifies the position of onedata item in the dictionary data base.
 25. The system as claimed inclaim 24 wherein the dictionary data base is divided into differentsections which have data items with locations which are identified bydata item location identifiers from different dictionary data baseindexes.
 26. The system as claimed in claim 25 wherein the dictionary data base includes storage space into which data items can be added.
 27. A computer program comprising instructions which are adapted to control a computer to provide at least one dictionary data base and at least two index data bases, the computer program including instructions to control the computer to provide the dictionary data base with a plurality of data items, controlling the computer to provide a first one of the index data bases with a plurality of data item location identifiers, which respectively identify the location of at least one data item in the dictionary data base and controlling the computer to provide a second one of the index data bases with a plurality of first location identifiers which respectively identify the location of at least one data item location identifier in the first index data base, controlling the computer to receive data including data items and to store the data in a compressed form by storing in place of each data item which occurs in the dictionary data base, each corresponding first location identifier, whereby each data item which occurs in the dictionary data base can be retrieved by referencing the data item location identifier identified by the first location identifier.
 28. The computer program as claimed in claim 27 wherein the at least two index data bases include separate lists of location identifiers in at least one other data base.
 29. The computer program as claimed in claim 28 wherein the first index data base is part of the dictionary data base.
 30. The computer program as claimed in claim 29 including an exception means for storing data items which do not occur in the dictionary data base.
 31. The computer program as claimed in claim 30 wherein the exception means includes a predetermined part of the dictionary data base.
 32. The computer program as claimed in claim 31 wherein the exception means is adapted to provide a dictionary index location identifier for any new data items stored in the dictionary data base.
 33. The computer program as claimed in claim 32 wherein the exception means includes an exceptions data base which is adapted to store data items which do not occur in the dictionary data base.
 34. The computer program as claimed in claim 34 including a means for storing different types of data items in different dictionary data bases.
 35. The computer program as claimed in claim 24 wherein each dictionary data base has predetermined dictionary location identifiers in the first one of the index data bases, which provide the location of data items in that data base.
 36. The computer program as claimed in claim 26 wherein each dictionary data base is split into a plurality of data types each having a plurality of data items of that type.
 37. A storage medium having a computer program as claimed in any one of claims 28 to 36 stored thereon.
 38. A system for retrieving data items stored in a miniaturised form, the system including at least one dictionary data base and at least two index data bases wherein the dictionary data base comprises a plurality of data items and a first one of the index data bases comprises a plurality of data item location identifiers which respectively identify the location of at least one data item in the dictionary data base and a second one of the index data bases includes a plurality of first location identifiers which respectively identify the location of at least one data item location identifier in the first index data base and a processing means, wherein the processing means is adapted to receive a first data stream including a plurality of first location identifiers and produce a second data stream including the data items without the first location identifiers and wherein first location identifiers are replaced by corresponding data items.
 39. The system as claimed in claim 38 wherein the first data stream is adapted to be received by reading a data base.
 40. The system as claimed in claim 39 wherein the data base is located in a storage medium which is readable by a computer hardware device.
 41. The system as claimed in claim 40 wherein the data stream is transmitted and received from a communications system.
 42. The system as claimed in claim 41 wherein the first data stream is received and stored in a data processor before being read and compressed or decompressed.
 43. The system as claimed in claim 42 including a scrambling means for reordering data item location identifiers in the first index data base and storing a method of reordering data item location identifiers utilised by the scrambling means, whereby reordering data item location identifiers can be reversed.
 44. The system as claimed in claim 43, wherein the scrambling means includes reversing means for reversing any reordering which has taken place of the dictionary item location identifiers.
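
The two-level look-up recited in claims 27 and 38, and the reversible reordering of claims 43 and 44, can be illustrated with a minimal Python sketch. It is only one possible reading of the claims and not the implementation described in the specification. All names (`dictionary`, `item_index`, `compress`, `decompress`, `scramble`, `unscramble`) and the sample words are hypothetical; first location identifiers are assumed to be integer positions in the first index data base, and data items absent from the dictionary data base are assumed to be stored unchanged.

```python
import random

# Hypothetical dictionary data base, most frequent data items first.
dictionary = ["the", "of", "data", "index"]

# First index data base: data item location identifiers, i.e. positions
# of data items in the dictionary data base.
item_index = list(range(len(dictionary)))


def compress(items, dictionary, item_index):
    """Store a first location identifier in place of each known data item.

    A first location identifier is taken here to be a position in
    item_index. Unknown data items are kept as they are (assumption).
    """
    first_id_of = {dictionary[loc]: first_id
                   for first_id, loc in enumerate(item_index)}
    return [first_id_of.get(item, item) for item in items]


def decompress(stream, dictionary, item_index):
    """Replace each first location identifier by its data item."""
    return [dictionary[item_index[x]] if isinstance(x, int) else x
            for x in stream]


def scramble(item_index, seed):
    """Reorder the data item location identifiers.

    Returns the reordered index together with the ordering that was used,
    so that the reordering can later be reversed.
    """
    order = list(range(len(item_index)))
    random.Random(seed).shuffle(order)
    return [item_index[i] for i in order], order


def unscramble(scrambled, order):
    """Reverse a reordering produced by scramble()."""
    restored = [None] * len(scrambled)
    for new_pos, old_pos in enumerate(order):
        restored[old_pos] = scrambled[new_pos]
    return restored


text = "the index of new data".split()
stream = compress(text, dictionary, item_index)
restored_text = decompress(stream, dictionary, item_index)

scrambled_index, order = scramble(item_index, seed=7)
restored_index = unscramble(scrambled_index, order)
```

In this sketch a stored stream can only be decoded correctly with the index it was compressed against, so a reordered index has to be restored with the stored ordering before `decompress` is called.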